-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Column data type handling #5
Conversation
- make the warnings visible - warn if no MSVC edition is found
+ BUGH()
The 'SYS TYPES' query is now executed right after connectivity check part of SQLDriverConnect(). The result is cached on the connection (es_type array member, one element for each type). When a query is executed and the results processed, the column data type sent by ES will be used to associate the right es_type element to the record associated with the column. This allows stripping the record structure from members that are only used for data type characterization, thus saving memory, as well as "transparency", in case new types are added or existing changed. (Some change would still be needed in the driver, due to way ODBC specifies "size" and "precision" of a column.)
When returning strings to the application, ODBC requires to include, but don't count the 0-terminator. This fixes the counting part (it was counted before)..
make use of cached data types when returning info to application, strip them record structure.
s/x-pack-odbc/elasticsearch-sql-odbc/
Use cached ES return type for these record members, in case they belong to an IRD descriptor.
Use the values provided by server in cached ES/SQL types to populate the IRD type members (away from SQL C types). This cleans up the implementation a bit and will allows copying of the descriptors later, if needed.
- placed in a header, to easy find and change if needed
- this is a r/o IRD specific field => moved it to es_type; - implement the Display Size calculation.
- fixed/finished the logic for these two measures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good to me Bogdan although I would prefer that load_es_types
was refactored to aid readability
driver/connect.c
Outdated
goto end; | ||
} | ||
|
||
/* check row statues; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: statuses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. (these rows are indeed not that monumental.)
driver/connect.c
Outdated
} | ||
|
||
/* check row statues; | ||
* calculate the lenght of all strings (SQLWCHAR members) returned |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
length.. (your favourite typo!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. (just ran a grep, this typo is occurring 40+ times..)
driver/connect.c
Outdated
@@ -1499,7 +2105,7 @@ SQLRETURN EsSQLGetConnectAttrW( | |||
case SQL_ATTR_CURRENT_CATALOG: | |||
DBGH(dbc, "requested: catalog name (@0x%p).", dbc->catalog); | |||
#if 0 | |||
if (! dbc->conn) { | |||
if (! dbc->es_types) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer that we didn't have unreachable code blocks such as this one. Could it safely be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed this block to remove/update the dead code.
* "manually" copy from a row structure (defined into the function) into the | ||
* estype structure. The array of estype structs is returned. | ||
*/ | ||
static BOOL load_es_types(esodbc_dbc_st *dbc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is very long! Is there any it could be broken into smaller, more easily digestible chunks?
An initial suggestion would be to move the macro and structure definitions out - although I do realise that this would break the scoping model you've used here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've broken the function down.
I didn't want initially to have another structure/type declared, even though I didn't like the length of the resulted function either. But I think you're ultimately right, having it smaller/readable is a better solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main comment is just a question. As long as it's been thought through that's fine.
driver/handles.c
Outdated
* comment at function end), so these values must stay in sync, since there | ||
* are no C corresponding defines for verbose and sub-code (i.e. nothing like | ||
* "SQL_C_DATETIME" or "SQL_C_CODE_DATE"). | ||
* The identity does not hold across the bord, though (extended values, like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: bord -> board
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks.
#define ESODBC_SQL_NULL 0 | ||
#define ESODBC_SQL_UNSUPPORTED 1111 | ||
#define ESODBC_SQL_OBJECT 2002 | ||
#define ESODBC_SQL_NESTED 2002 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are also geo and specialised types.
Do they get converted to "unsupported" on the Java side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the specialised types are sent as UNSUPPORTED currently, yes (at least the IP type is). Not sure where geo would fit, but I guess it's treated similarly.
- mostly s/lenght/length
The sign-explicit defs have been added to replace the implicitly signed ones.
Update the logic that gets the catalog name: there's a new SYS CATALOGS available system query that can be used, so the way to implement it in the driver is now clear.
BIGINT is a new type, which are all explicitely signed/unsigned. (not sure how it slipped in previous commit, past a pre-commit compile)
load_es_types() got too large.
Once a query is executed, the client-app queries the driver for a set of details related to the results it got: how many columns it got, number of rows, types of the columns.
Column type is mapped onto a descriptor, IRD (Implementation Row Descriptor). The details available here are quite numerous: SQL type of data contained in the column, size ("precision"), decimal digits ("scale"), nullable, type name etc., as well as table [base] name, column alias a.s.o.
Generally with DBMSes, the data types can be either fixed SQL data types (ex. INTEGER) or derived, column-specific (like VARCHAR(10), FLOAT(precision, scale)). So, typically, a driver would need to receive from the data source a verbose description of the result set for every query.
However, ES/SQL only supports "fixed" data types currently (i.e. one can not "create" a table/index with specific, non-general types). So the driver takes advantage of this, by querying Elasticsearch for the data types when it first connects to and then cache these (per connection). Then, for each application-specific query result set, the driver will simply refer to the cached types when the app requests details about the columns data types. This is what this PR is mostly about.
Collaterally, the four column data-type concepts - size, dec-digits, octet-len and display size - are finished/fixed.